perm filename LIBRAR.RED[F83,JMC] blob sn#727158 filedate 1983-10-08 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00002 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	∂02-Oct-83  1618	Janet.Asbury@CMU-CS-A 	Electronic Library    
C00061 ENDMK
C⊗;
∂02-Oct-83  1618	Janet.Asbury@CMU-CS-A 	Electronic Library    
Received: from CMU-CS-A by SU-AI with TCP/SMTP; 2 Oct 83  16:15:01 PDT
Received: from [128.2.254.192] by CMU-CS-PT with CMUFTP;  2 Oct 83 19:01:48 EDT
Date:  2 Oct 83 1911 EDT (Sunday)
From: Jan.Asbury <Janet.Asbury@CMU-CS-A>
To: JMC@SU-AI
Subject: Electronic Library
CC: Janet.Asbury@CMU-CS-A
Message-Id: <02Oct83.191101.JA61@CMU-CS-A>

Dear Dr. McCarthy,

I just checked my electronic mail and discovered that the message I sent
you on Friday  never reached you because I mistakenly addressed it to JCM,
not JMC.  I am so sorry for the error.  Professor Reddy had said he was
hoping to receive a revised version of the proposal from you by tomorrow
so he can send it on.  As you will see, there are only minor changes between
this version and the first version we sent you.  The word 'Physical' in the
first line has been changed to 'conventional', some section headings have
been changed to subsections, and Section 3.3 reads 'Technical Issues' instead
of 'Technical Problems'.  I am sending an MSS file and a DOC file.  Please
make any revisions on the MSS file.

Jan Asbury
Secretary to Professor Reddy

@device(dover)
@make(article)
@heading(A PILOT PROJECT IN ELECTRONIC LIBRARIES)
@center(J. McCarthy, M. Griffith, R. Reddy)

@section(SUMMARY)

@section(PROBLEM)

An Electronic Library, as the name suggests, is the electronic
equivalent of a conventional library.  It is expected that when a 
system capable of fulfilling all the functions of a real library
is developed, it can provide many new functions that are currently
unavailable:  instantaneous access from anyone's home or office,
multiple access by many people so that a book is never out, reduced
cost, language translation aids, the ability to search text and to
make excerpts and indexes and the integration of information retrieval
and more conventional reading.

Constructing an operational Electronic Library poses a number of
technical, social and legal problems at present.  The purpose of the
Pilot Project is to bypass many of these problems for now, and
concentrate on a few core problems which do not require any new
technical breakthroughs.  The key problems to be studied will
involve the questions of how to acquire, represent, transmit, and
present Nineteenth Century French Literature.  By selecting NCFL,
we finesse the issue of copyright and the problems of formulas and
drawings, and we increase the availability of French Literature.

@section(BACKGROUND AND NEED)

For some time it has been cost-effective to put an entire
national library into a computer file and make all its resources
available to anyone in the country with a computer terminal.  There
is no need to argue that all printed paper will be abolished, but
many people would get rid of ninety percent of their
books and magazines if they could access them electronically from a
terminal at home.

It is now possible to get a gigabyte disk module for 
under $20,000.  If we count a book as 500,000 bytes, then this
module can store 2000 books.  The space occupied by the module
would store about 300 books on shelves.  The cost comes to $10 per
book.  Recent word information compression would give another factor
of four in storage density, reducing the cost to approximately
$2.50 per book and reducing the storage volume to @i(one twenty-fourth)
of that required to store books on shelves.
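
The storage arithmetic above can be checked with a short calculation.
The sketch below uses only the figures quoted in the preceding paragraph;
the variable names are illustrative and not part of any system.  It gives
a density ratio of roughly 27, of the same order as the one twenty-fourth
quoted above.

```python
# Figures taken from the text; the variable names are illustrative only.
DISK_BYTES = 1_000_000_000      # one gigabyte disk module
DISK_COST_DOLLARS = 20_000      # upper bound on the module price
BOOK_BYTES = 500_000            # assumed size of one book
COMPRESSION = 4                 # factor attributed to word compression
SHELF_BOOKS = 300               # books shelved in the space of one module

books_per_disk = DISK_BYTES // BOOK_BYTES
cost_per_book = DISK_COST_DOLLARS / books_per_disk
compressed_books = books_per_disk * COMPRESSION
compressed_cost = cost_per_book / COMPRESSION
density_ratio = compressed_books / SHELF_BOOKS

print("books per module:", books_per_disk)
print("dollars per book:", cost_per_book)
print("dollars per compressed book:", compressed_cost)
print("compressed books per shelved book:", round(density_ratio, 1))
```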

Books cost much more these days and so does the space required to store
them, although the cost of cataloging the books is apparently larger
than either.

IBM recently announced disk files (the 3380) storing 2.52 gigabytes per
unit, which would store 20,000 compressed books in the space taken
by 300 on shelves.  The U.S. Library of Congress would then require between
1000 and 2000 such disk units.

Digital videodisks storing much more are predicted for
the reasonably near future, but the project is practical with technology
now in hand.  It is time to begin.

Consider the following system.  In addition to existing paper
libraries, there would exist one or more computerized libraries containing
everything that has ever been published, i.e. a computerized version
of the Bibliotheque Nationale.  This library would be accessible
over the telephone network from any computer terminal in the country.
A reader could browse through the library catalog and various
bibliographies just as though he were physically present.  He could read
any book by calling it page by page onto his terminal's screen or he
could have it transmitted to a local printer.
Printers with fixed fonts are now available for a few hundred dollars,
and laser printers that print arbitrary fonts now cost about
$10,000, but will become cheaper.

Most office workers would have terminals on their desks, and many
people would have them at home.  At present a good enough terminal
costs about $500, and high quality terminals should cost about
$1,000 if manufactured in moderate quantity.  Most offices can afford
a high quality printer.

Of course, yet better terminals may eventually be available.  We can
imagine a pocket terminal consisting of a rolled up plastic screen
with a 1024 by 1024 array of liquid crystal dots accompanied by
another rolled up pressure sensitive keyboard and a pocket computer
with enough memory to store a book.  Suppose that it has a modular
jack that can  plug into any telephone so that the user can call the
library, scan it for a while and then reload his book memory.  This 
would be nicer than the technology now available, but the available
technology is good enough to justify a start.

From the user's point of view, the advantages of the computerized
library are the following:
@begin(enumerate)
All books, magazines and newspapers are available.

Anything can be obtained in a few seconds.

Nothing is ever out.

The library is open 24 hours a day 365.2425 days a year.
@end(enumerate)

Many paper libraries would be found unnecessary.  In particular,
university libraries could carry out their functions with much less
money and manpower, since their users would switch to the electronic
library for much of their work.

The establishment of such a system involves many problems and will take
some years, but we will mention some facilities that can and should be
started right away.  Moreover, people who don't have terminals at home now
or on their desks or don't use them at all may be difficult to convince
of the advantages of such a library.

@subsection(Problems)
@begin(itemize)
It is expensive to convert the books to computer readable form.  Equipment
for reading special type fonts is available and reliable.  The recent
Kurzweil equipment reads arbitrary fonts with training but is reported
to rely to a substantial degree on a blind person's ability to know when
something was garbled and try again and on his ability to understand imperfectly
read material.  The lowest error rates are apparently those obtained
by the Information International Grafix I system.  This machine is very
expensive, mainly because it uses obsolete computer hardware, but
the company would update it if the market existed.  Even if much of
the material had to be retyped by hand, the project would be worth what
it would cost.

Of course, much new material is generated in computer-readable form,
but many forms are used, and as yet no-one has developed a system
for putting all this material into a common form.

The copyright law requires permission to put copyrighted material into
computer form.  In my opinion, copyrights should be respected and
suitable financial arrangements based on readership should be negotiated.
Once a computerized library exists, it will be so much more accessible than 
other libraries that authors and publishers will find it to their
advantage to negotiate suitable deals.

The best arrangement might be that the copyright owner could set whatever
price he pleased for reading his material.  The reader could decide whether
or not to pay it.

There is a problem of unauthorized copying.  The problem exists whether a
national library exists or not, and the temptations will increase as
copying machines get more convenient and cheaper and when a general
purpose machine for reading documents from paper to computer files
becomes available.

At present an author gets ten to twenty percent of the retail price
of his books, except that he gets nothing for unsold books and less
for mass market paperbacks.  An electronic publishing system could
afford to give the author eighty percent of the price paid by the
readers, because there would be no physical production or distribution
costs.  This would permit increased income for authors and reduced
prices for the readers.  Presumably there is some price elasticity for
reading that would produce more reading with reduced prices and greater
convenience.  This would greatly reduce the temptation to copy illegally,
since the reader would find it less burdensome to pay the writer his due.

It is likely that the amount of illegal copying would be low enough
so that the system would survive.  If not, we will eventually have
to go to a system where reading is essentially free and writers are 
paid according to a formula by the Government.  This would have many
disadvantages, since no formula could take fully into account the
fact that different writers have different abilities and put different 
amounts of work into books of different kinds.  Of course, 
the present system doesn't take this into account very well either,
but there are some works now that charge very high prices, e.g.
newsletters.  These could still operate outside the standard system.
@end(itemize)

@subsection(Getting Started)

Already there exist numerous databases available by telephone.
Some of them contain bibliographic 
information, i.e. abstracts and references, but others contain the
texts of the material.  Some of them are subsidized by government
grants, e.g. many of the medical databases, and others, e.g. the
legal databases and the "New York Times" Databank, are profit making
businesses.  The charges for using them range from $25 to $200 per
hour except for subsidized customers.

One important step could be taken by the Federal Government.  It
is required by the Freedom of Information Act and other laws to
make very large amounts of information available to the public.
This information would be much more conveniently available 
if it were in a database accessible from anywhere in the country.
This especially includes the Federal Register where all new laws,
regulations, announcements of hearings and requests for comments
are published.

@subsection(Technical Issues)

While it is easy to compute the costs of the storage media, which
are already cheaper than paper, it is harder to calculate the
costs of the computers.  This is because present systems have not
really been optimized for handling very large numbers of users.
It will also be necessary to optimize telephone access.  For this
there are many possibilities.

A daytime cross-country call costs 54 cents for one minute.
In a minute 36,000 bytes can be transmitted at 4800 bits/second.
This means from $7.50 to $15.00 to transmit a book uncompressed
or from $1.87 to $3.75 with a compression of 4.  We can imagine a 
terminal that could store a minute's worth of text and could decompress
it for reading.  These costs are unpleasantly high, but they can be
reduced in various ways.  First, technology permits substantially
lower long distance transmission costs.  Indeed the one minute
transcontinental charge late at night is 16 cents making our
compressed book cost from $.56 to $1.12 if transmitted all at once.
This is probably less than the cost of a trip to a library if one's time
is worth much.  The independent long distance telephone companies are
often 40 percent below AT&T, which brings our optimistic number down to
33 cents, which is reminiscent of the days when pocket books were a
quarter.
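
The transmission figures above follow from the line speed and the
per-minute tariff.  The sketch below reproduces them under two
assumptions that are ours: fractional minutes are billed pro rata, and
the upper end of each quoted range corresponds to a book of 1,000,000
bytes, twice the 500,000 bytes used elsewhere in this proposal.  The
function name transmission_cost is hypothetical.

```python
BITS_PER_SECOND = 4800
BYTES_PER_MINUTE = BITS_PER_SECOND // 8 * 60   # 36,000 bytes, as in the text


def transmission_cost(book_bytes, cents_per_minute, compression=1):
    """Dollars to send one book, billing fractional minutes pro rata."""
    minutes = book_bytes / compression / BYTES_PER_MINUTE
    return minutes * cents_per_minute / 100


DAY_RATE = 54                 # cents per minute, daytime cross-country
NIGHT_RATE = 16               # cents per minute, late at night
DISCOUNT_RATE = 16 * 0.6      # independent carrier, 40 percent below

for size in (500_000, 1_000_000):   # second size is an assumed upper bound
    print(size,
          round(transmission_cost(size, DAY_RATE), 2),
          round(transmission_cost(size, NIGHT_RATE, compression=4), 2),
          round(transmission_cost(size, DISCOUNT_RATE, compression=4), 2))
```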

We can suppose that the terminal would remember the telephone number and
catalog number and automatically phone for another minute's transmission when the
reader is close to the end of what it has in storage.  These costs
are even less attractive when browsing is wanted.  A solution for that
is to use the European telephone charging system which allows calls
as short as 4 seconds.  Current networks keep the cost for maintaining
a connection down by time-sharing lines, but this doesn't reduce the cost 
of straight data transmission.

An obvious possible saving is to have local libraries with frequently
consulted books and magazines.  With optical fibers and other new means of
transmission, the transmission costs can be brought down to the
point that local libraries will be unnecessary.

@subsection(French Electronic Library)
It is now socially worthwhile and economically
feasible to put the world literature in the French language into
computer form and make it available worldwide.

Imagine the following system.  The French language literature is put
into computer form, either by optical character recognition machines
or by keyboarding in low wage countries.  A central computer library
in France keeps this literature on the equivalent of about 1000 IBM
3380 disk files.  Three large bandwidth satellites are put up
to provide worldwide transmission facilities.  Reading rooms with
suitable terminals are located in every place where there is
sufficient interest.  A reader can call up any book or other document
from any terminal.  When he does so, the first two pages are 
transmitted via the satellite to the reading room computer and the first
page is displayed on his terminal.  Perhaps the library catalog and 
other currently popular documents are kept in local file.

@section(CURRENT STATUS)

<Mike Griffith to provide>

@section(PLAN FOR RESEARCH)

We propose to undertake the following pilot project.

@begin(enumerate)
A few RA81 disks are acquired from Digital Equipment Corporation and
attached to a VAX computer.  This is currently the most cost-effective
disk file available.

A request for proposals for a few hundred thousand dollars worth of
book input is sent both to keyboarding companies and those that do
optical character recognition.  In addition existing computerized
text is solicited from those who have it for experimental use.
The initial reading list is taken from the public domain literature.

About 20 telephone lines are attached to the VAX, so that the library
is available from existing terminals and micro-computers in the Paris
area.

@begin(multiple)
The necessary programs are written and installed.

At this point a technical demonstration is feasible.  An attempt
is made to determine what is most attractive to the users of the
library within the budget available.
@end(multiple)

An experimental terminal cluster is installed in a reading room in the
Paris area.  It should be a place that is open for a large number of hours.
@end(enumerate)

If the results are encouraging, the second phase includes:
@begin(enumerate)
Giving the computerized library its own computer.

More books.

Obtaining the co-operation of publishers of current books, magazines
and newspapers for an expanded program.  An experimental financial
arrangement should be adopted.

Design of a reading terminal that can be used in connection with the 
French telephone system's electronic yellow pages.

An experimental reading room in an underdeveloped country using
existing satellite transmission channels.

Developing an optical character recognition system optimized toward
reading books.
@end(enumerate)

The pilot project is intended to lead to a demonstration by
the end of 1984 with several thousand books on line.

@begin(enumerate)
EQUIPMENT PLAN - We expect to start with a VAX with a gigabyte of
memory as the EL Machine located at CMIRH in Paris.  This machine
will have at least 32 lines permitting anyone in the Paris region
with a terminal, personal computer or a Minitel to be able to use it.
By 1985 we hope to extend the service throughout France using the
CMIRH network.

ACQUISITION - There are already several thousand books available at
"----- Le Langru Francais" at Nancy.  We hope to acquire these.
In addition we hope to acquire a similar collection from Britain and
the USA.  Also we will have about 1000 books manually entered in Third
World countries.  This is expected to be quite inexpensive, about 2000
FFr per book.
@end(enumerate)

All these different books will probably come in different formats.
We will develop format conversion programs to put them in CMIRH
standard format.

@i(Representation.)  Information on the disk will be stored in a compact form with
frequently occurring words coded and formatting information bracketed
appropriately.
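
As an illustration of coding frequently occurring words, the following
minimal sketch replaces each of the most common words in a text by a
small integer and leaves all other words as they are.  It is not the
CMIRH format, which is not yet defined; the function names and the
codebook size are assumptions made for the example, and the round trip
is exact only for text whose words are separated by single spaces.

```python
from collections import Counter


def build_codebook(text, size=255):
    """Return the `size` most frequent words, most common first."""
    counts = Counter(text.split())
    return [word for word, _ in counts.most_common(size)]


def encode(text, codebook):
    """Replace codebook words by their integer index; keep other words."""
    index = {word: i for i, word in enumerate(codebook)}
    return [index.get(word, word) for word in text.split()]


def decode(tokens, codebook):
    """Invert encode(): integers are looked up, strings pass through."""
    return " ".join(codebook[t] if isinstance(t, int) else t for t in tokens)


sample = "le chat et le chien et le rat"
codebook = build_codebook(sample, size=2)
tokens = encode(sample, codebook)
print(codebook)
print(tokens)
print(decode(tokens, codebook))
```

A terminal with local processing would receive the codebook once and
then only the token stream; a dumb terminal would receive the decoded
text.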

Terminals and personal computers with local processing capability will
receive a decoding program followed by coded text which is expected to
also reduce the transmission time and cost.  Dumb terminals will receive
fully decoded text.  Decoding time should be less than 1 second per
10 words in sequence.

@i(Transmission.)  Initially only serial line transmission will be considered.  The VAX will
support up to 19.2 kilobaud transmission.  Terminals and personal 
computers with local processing will be able to correct transmission
errors using Kermit-like programs.  They can also accept data at much
higher rates for later presentation at user specified rates.
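
The kind of error correction meant here can be sketched as follows: the
text is sent in numbered packets, each carrying a checksum, and the
receiver asks for a fresh copy of any packet whose checksum does not
match.  This is only a sketch of the general idea and not Kermit's
actual packet format; the packet layout, the function names and the use
of CRC-32 are assumptions chosen for the example.

```python
import zlib


def make_packet(seq, payload):
    """Frame payload bytes as (sequence number, payload, CRC-32)."""
    return (seq, payload, zlib.crc32(payload))


def is_intact(packet):
    """True when the payload still matches the checksum sent with it."""
    _, payload, checksum = packet
    return zlib.crc32(payload) == checksum


def receive(packets, resend):
    """Reassemble the text, calling resend(seq) for each damaged packet."""
    data = bytearray()
    for packet in packets:
        while not is_intact(packet):
            packet = resend(packet[0])   # ask the sender for a fresh copy
        data += packet[1]
    return bytes(data)


good = [make_packet(i, chunk)
        for i, chunk in enumerate([b"Les ", b"Mis", b"erables"])]
damaged = list(good)
damaged[1] = (1, b"Mxs", good[1][2])     # simulate one corrupted byte
print(receive(damaged, lambda seq: good[seq]))
```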

@i(Presentation.)  It will be possible to access information from the on-line library
from almost all commonly available terminals and personal computers.

However, from an ergonomic (human factors) point of view, high 
resolution bit-mapped displays (equivalent in resolution to the FAX
standard) with a powerful personal computer with at least 2 megabytes
of memory would be highly desirable.  Low cost versions (<$1000)
of such terminals should be available by the end of the decade.
It is expected to take at least that long to acquire and represent
a substantial collection of books, reports and newspapers in electronic
form.

@i(Selection.)  <What books will be on-line in the first year.  Mike Griffith to 
approach the Académie Française.>


                    A PILOT PROJECT IN ELECTRONIC LIBRARIES

                      J. McCarthy, M. Griffith, R. Reddy


1. SUMMARY

2. PROBLEM
  An  Electronic Library, as the name suggests, is the electronic equivalent of
a conventional library.    It  is  expected  that  when  a  system  capable  of
fulfilling  all  the  functions  of a real library is developed, it can provide
many new functions that  are  currently  unavailable:    instantaneous  access,
multiple access, reduced cost, and language translation aids.

  Constructing  an  operational Electronic Library poses a number of technical,
social and legal problems at present.  The purpose of the Pilot Project  is  to
bypass  many  of these problems for now, and concentrate on a few core problems
which do not require any new technical breakthroughs.  The key problems  to  be
studied  will involve the questions of how to acquire, represent, transmit, and
present Nineteenth Century French Literature.  By selecting  NCFL,  we  finesse
the  issue  of  copyright  and the problems of formulas and drawings, and ensure
widespread availability of French Literature.

3. BACKGROUND AND NEED
  For some time it has  been  cost-effective  to  put  the  entire  Library  of
Congress into a computer file and make all its resources available to anyone in
the  country  with  a  computer  terminal.   There is no need to argue that all
printed paper will be abolished, but  I  would  certainly  get  rid  of  ninety
percent  of  my  books and magazines if I could access them from my terminal at
home.

  It is now possible to get a gigabyte disk module for under $20,000.    If  we
count  a  book  as  500,000  bytes, then this module can store 2000 books.  The
space occupied by the module would store about 300 books on shelves.  The  cost
comes  to $10 per book.  Recent word information compression would give another
factor of four in storage density, reducing the cost to approximately $2.50 per
book and reducing the storage volume to one twenty-fourth of that  required  to
store books on shelves.

  Books cost much more these days and so does the space required to store them,
although the cost of cataloging the books is apparently larger than either.

  IBM  recently  announced disk files (the 3380) storing 2.52 gigabytes per unit,
which would store 20,000 compressed books in the space taken by 300 on shelves.
The Library of Congress would then require between  1000  and  2000  such  disk
units.

  Digital  videodisks  storing  much more are predicted for the reasonably near
future, but the project is practical with technology now in hand.  It  is  time
to begin.

  Consider  the  following  system.    In addition to existing paper libraries,
there would exist one or more computerized libraries containing everything that
has ever been  published,  i.e.  a  computerized  version  of  the  Library  of
Congress.  This library would be accessible over the telephone network from any
computer  terminal  in  the country.  A reader could browse through the library
catalog and various bibliographies just as though he were  physically  present.
He could read any book by calling it page by page onto his terminal's screen or
he  could  have  it  transmitted  to  a local printer.  At present, there are a
number of laser printers that print arbitrary fonts for less than $10,000,  but
we can envisage cheaper printers in the future.

  Most  office  workers  would  have  terminals on their desks, and many people
would have them at home.  At present a good enough terminal costs  about  $800,
and high quality terminals should cost about $2,000 if manufactured in moderate
quantity.  Most offices can afford a high quality printer.

  Of  course, yet better terminals may eventually be available.  We can imagine
a pocket terminal consisting of a rolled up plastic screen with a 1024 by  1024
array  of  liquid  crystal  dots  accompanied  by  another  rolled  up pressure
sensitive keyboard and a pocket computer with enough memory to  store  a  book.
Suppose that it has a modular jack that can plug into any telephone so that the
user can call the library, scan it for a while and then reload his book memory.
This  would  be  nicer  than  the  technology  now available, but the available
technology is good enough to justify a start.

  From the user's point of view, the advantages of the computerized library are
the following:

   1. All books, magazines and newspapers are available.

   2. Anything can be obtained in a few seconds.

   3. Nothing is ever out.

   4. The library is open 24 hours a day 365.2425 days a year.

  Many paper libraries would be found unnecessary.  In  particular,  university
libraries  could  carry  out their functions with much less money and manpower,
since their users would switch to the electronic  library  for  much  of  their
work.

  The  establishment of such a system involves many problems and will take some
years, but we will mention some facilities that can and should be started right
away.  Moreover, people who don't have terminals at home now or on their  desks
or don't use them at all may be difficult to convince of the advantages of such
a library.



3.1. Problems

   - It  is  expensive  to  convert  the  books to computer readable form.
     Equipment for reading special type fonts is available  and  reliable.
     The recent Kurzweil equipment reads arbitrary fonts with training but
     is  reported  to  rely  to  a  substantial degree on a blind person's
     ability to know when something was garbled and try again and  on  his
     ability  to  understand  imperfectly read material.  The lowest error
     rates are apparently those obtained by the Information  International
     Grafix  I  system.  This machine is very expensive, mainly because it
     uses obsolete computer hardware, but the company would update  it  if
     the  market  existed.  Even if much of the material had to be retyped
     by hand, the project would be worth what it would cost.

   - Of course, much new material is generated in computer-readable  form,
     but many forms are used, and as yet no-one has developed a system for
     putting all this material into a common form.

   - The  copyright  law  requires  permission to put copyrighted material
     into computer form.  In my opinion, copyrights  should  be  respected
     and  suitable  financial  arrangements  based on readership should be
     negotiated.  Once a computerized library exists, it will be  so  much
     more accessible than other libraries that authors and publishers will
     find it to their advantage to negotiate suitable deals.

   - The  best  arrangement  might  be  that the copyright owner could set
     whatever price he pleased for reading his material.  The reader could
     decide whether or not to pay it.

   - There is a problem of  unauthorized  copying.    The  problem  exists
     whether  a  national  library exists or not, and the temptations will
     increase as copying machines get more convenient and cheaper and when
     a general  purpose  machine  for  reading  documents  from  paper  to
     computer files becomes available.

   - At  present  an author gets ten to twenty percent of the retail price
     of his books, except that he gets nothing for unsold books  and  less
     for  mass  market  paperbacks.  An electronic publishing system could
     afford to give the author eighty percent of the  price  paid  by  the
     readers,   because   there   would   be  no  physical  production  or
     distribution costs.  This would permit increased income  for  authors
     and  reduced  prices for the readers.  Presumably there is some price
     elasticity for reading that would produce more reading  with  reduced
     prices  and  greater  convenience.    This  would  greatly reduce the
     temptation to copy illegally, since the reader  would  find  it  less
     burdensome to pay the writer his due.

   - It  is  likely that the amount of illegal copying would be low enough
     so that the system would survive.  If not, we will eventually have to
     go to a system where reading is essentially free and writers are paid
     according to a formula by the  Government.    This  would  have  many
     disadvantages,  since  no  formula  could take fully into account the
     fact  that  different  writers  have  different  abilities  and   put
     different  amounts of work into books of different kinds.  Of course,
     the present system doesn't take this into account very  well  either,
     but  there  are  some  works  now  that charge very high prices, e.g.
     newsletters.  These could still operate outside the standard system.



3.2. Getting Started
  Already there exist numerous databases available by telephone  from  anywhere
in the country.  Some of them contain bibliographic information, i.e. abstracts
and references, but others contain the texts of the material.  Some of them are
subsidized  by  government  grants,  e.g.  many  of  the medical databases, and
others, e.g. the legal databases and the "New York Times" Databank, are  profit
making  businesses.  The charges for using them range from $25 to $200 per hour
except for subsidized customers.

  One important step could be taken by the Federal Government.  It is  required
by  the Freedom of Information Act and other laws to make very large amounts of
information available to the public.   This  information  would  be  much  more
conveniently available if it were in a database accessible from anywhere in the
country.    This  especially  includes the Federal Register where all new laws,
regulations, announcements of hearings and requests for comments are published.



3.3. Technical Issues
  While it is easy to compute the costs of the storage media, which are already
cheaper than paper, it is harder to calculate the costs of the computers.  This
is because present systems have not really been  optimized  for  handling  very
large  numbers  of  users.    It  will  also be necessary to optimize telephone
access.  For this there are many possibilities.

  A daytime cross-country call costs 54 cents for one  minute.    In  a  minute
36,000  bytes can be transmitted at 4800 bits/second.  This means from $7.50 to
$15.00 to  transmit  a  book  uncompressed  or  from  $1.87  to  $3.75  with  a
compression  of 4.  We can imagine a terminal that could store a minute's worth
of text and could decompress it for reading.    These  costs  are  unpleasantly
high,  but  they  can  be  reduced  in various ways.  First, technology permits
substantially lower long distance transmission costs.  Indeed  the  one  minute
transcontinental  charge  late  at night is 16 cents making our compressed book
cost from $.56 to $1.12 if transmitted all at once.  This is probably less than
the cost of a trip to a library if one's time is worth much.   The  independent
long distance telephone companies are often 40 percent below AT&T, which brings
our  optimistic  number down to 33 cents, which is reminiscent of the days when
pocket books were a quarter.

  We  can  suppose  that  the  terminal would remember the telephone number and
catalog number and automatically phone for another minute's  transmission  when
the reader is close to the end of what it has in storage.  These costs are even
less  attractive  when  browsing  is wanted.  A solution for that is to use the
European telephone charging system which allows calls as short  as  4  seconds.
Current   networks   keep  the  cost  for  maintaining  a  connection  down  by
time-sharing  lines,  but  this  doesn't  reduce  the  cost  of  straight  data
transmission.

  An  obvious  possible  saving  is  to  have  local  libraries with frequently
consulted books and magazines.  With optical fibers  and  other  new  means  of
transmission,  the  transmission  costs  can  be brought down to the point that
local libraries will be unnecessary.



3.4. French Electronic Library
  It is now socially worthwhile and economically feasible to  put  the  world
literature  in  the  French  language into computer form and make it available
worldwide.

  Imagine the following system.  The French  language  literature  is  put  into
computer   form,  either  by  optical  character  recognition  machines  or  by
keyboarding in low wage countries.  A central computer library in France  keeps
this  literature  on  the  equivalent of about 1000 IBM 3380 disk files.  Three
large bandwidth  satellites  are  put  up  to  provide  worldwide  transmission
facilities.    Reading rooms with suitable terminals are located in every place
where there is sufficient interest.  A reader can call up  any  book  or  other
document  from  any  terminal.    When  he  does  so,  the  first two pages are
transmitted via the satellite to the reading room computer and the  first  page
is  displayed on his terminal.  Perhaps the library catalog and other currently
popular documents are kept in local file.

4. CURRENT STATUS
  <Mike Griffith to provide>

5. PLAN FOR RESEARCH
  We propose to undertake the following pilot project.

   1. A few RA81 disks are acquired from Digital Equipment Corporation and
      attached  to  a  VAX  computer.    This  is   currently   the   most
      cost-effective disk file available.

   2. A  request for proposals for a few hundred thousand dollars worth of
      book input is sent both to keyboarding companies and those  that  do
      optical character recognition.  In addition, existing computerized
      text is solicited, for experimental use, from those who have it.  The
      initial reading list is taken from the public domain literature.

   3. About  20  telephone  lines  are  attached  to  the VAX, so that the
      library is available from existing terminals and micro-computers  in
      the Paris area.

   4. The necessary programs are written and installed.

      At  this point a technical demonstration is feasible.  An attempt is
      made to determine what is  most  attractive  to  the  users  of  the
      library within the budget available.

   5. An  experimental  terminal cluster is installed in a reading room in
      the Paris area.  It should be a place  that  is  open  for  a  large
      number of hours.

  If the results are encouraging, the second phase includes:

   1. Giving the computerized library its own computer.

   2. More books.

   3. Obtaining the co-operation of publishers of current books, magazines
      and  newspapers  for an expanded program.  An experimental financial
      arrangement should be adopted.

   4. Design of a reading terminal that can be used in connection with the
      French telephone system's electronic yellow pages.

   5. An experimental reading room  in  an  underdeveloped  country  using
      existing satellite transmission channels.

   6. Developing  an optical character recognition system optimized toward
      reading books.

  The pilot project is intended to lead to a demonstration by the end  of  1984
with several thousand books on line.

   1. EQUIPMENT  PLAN  -  We expect to start with a VAX with a gigabyte of
      memory as the EL Machine located at CMIRH in Paris.    This  machine
      will  have  at  least 32 lines permitting anyone in the Paris region
      with a terminal, personal computer or a Minitel to be  able  to  use
      it.    By 1985 we hope to extend the service throughout France using
      the CMIRH network.

   2. ACQUISITION - There are already several thousand books available at
      "-----  Le Langru Francais" at Nancy.  We hope to acquire these.  In
      addition we hope to acquire a similar collection  from  Britain  and
      the  USA.    Also  we will have about 1000 books manually entered in
      Third World countries.  This is expected to  be  quite  inexpensive,
      about 2000 FFr per book.

  All  these  different books will probably come in different formats.  We will
develop format conversion programs to put them in CMIRH standard format.

  Representation. Information on the disk will be stored in a compact form, with
frequently occurring words coded and formatting information bracketed
appropriately.

  Terminals  and  personal  computers  with  local  processing  capability will
receive a decoding program followed by coded text which  is  expected  to  also
reduce  the  transmission  time  and  cost.   Dumb terminals will receive fully
decoded text.  Decoding time should be less than  1  second  per  10  words  in
sequence.
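
  One way to picture the word coding and decoding described above is the
following minimal sketch.  It is not the proposed CMIRH format: the function
names, the codebook size and the sample sentence are assumptions made for
illustration, codes are kept as Python integers rather than packed into bytes,
and the bracketing of formatting information is left out.

```python
import re
from collections import Counter
from typing import List, Union

# Split text into alternating runs of word and non-word characters,
# so that joining the pieces reproduces the text exactly.
TOKEN = re.compile(r"\w+|\W+")


def build_codebook(text: str, size: int) -> List[str]:
    """Return the `size` most frequent words; a word's position is its code."""
    counts = Counter(re.findall(r"\w+", text))
    return [word for word, _ in counts.most_common(size)]


def encode(text: str, codebook: List[str]) -> List[Union[int, str]]:
    """Replace each frequent word by its integer code; keep the rest as text."""
    index = {word: code for code, word in enumerate(codebook)}
    return [index.get(token, token) for token in TOKEN.findall(text)]


def decode(coded: List[Union[int, str]], codebook: List[str]) -> str:
    """Expand integer codes back into words and rejoin the text."""
    return "".join(codebook[t] if isinstance(t, int) else t for t in coded)


sample = "le chat et le chien et le rat"
codebook = build_codebook(sample, 2)
coded = encode(sample, codebook)
restored = decode(coded, codebook)
```

  A terminal with local processing would receive the codebook and the decoding
routine once and the coded text thereafter; a dumb terminal would be sent the
output of decode instead.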

  Transmission. Initially only serial line transmission will be considered.
The VAX will support transmission at up to 19.2 kilobaud.  Terminals and
personal computers with local processing will be able to correct transmission
errors using Kermit-like programs.  They can also accept data at much higher
rates for later presentation at user specified rates.
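
  The error correction mentioned above can be illustrated by a checksum and
retransmit loop.  The sketch below is in the spirit of Kermit but is not the
Kermit protocol: the packet layout (one sequence byte, the data, one additive
checksum byte), the names make_packet, accept, send_reliably and NoisyLine, and
the simulated line that damages every other transmission are all assumptions.

```python
from typing import Callable, List, Optional, Tuple


def checksum(data: bytes) -> int:
    """Simple additive checksum, modulo 256 (an assumption, not Kermit's)."""
    return sum(data) % 256


def make_packet(seq: int, payload: bytes) -> bytes:
    """Frame a payload as: sequence byte, payload, checksum byte."""
    body = bytes([seq % 256]) + payload
    return body + bytes([checksum(body)])


def accept(packet: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (sequence, payload) if the checksum matches, otherwise None."""
    if len(packet) < 2:
        return None
    body, check = packet[:-1], packet[-1]
    if checksum(body) != check:
        return None
    return body[0], body[1:]


def send_reliably(payloads: List[bytes],
                  line: Callable[[bytes], bytes],
                  max_tries: int = 5) -> List[bytes]:
    """Send each payload over `line`, repeating it until it arrives intact."""
    received = []
    for seq, payload in enumerate(payloads):
        for _ in range(max_tries):
            result = accept(line(make_packet(seq, payload)))
            if result is not None and result[0] == seq % 256:
                received.append(result[1])
                break
        else:
            raise IOError(f"packet {seq} failed after {max_tries} tries")
    return received


class NoisyLine:
    """Simulated serial line that damages every odd-numbered transmission."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, packet: bytes) -> bytes:
        self.calls += 1
        if self.calls % 2 == 1:
            damaged = bytearray(packet)
            damaged[1] ^= 0x01  # flip one bit so the checksum no longer matches
            return bytes(damaged)
        return packet


line = NoisyLine()
pages = [b"premiere page", b"deuxieme page"]
got = send_reliably(pages, line)
```

  Here each page is sent twice, once damaged and once intact, and the receiver
ends up with clean copies of both.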

  Presentation. It will be possible to  access  information  from  the  on-line
library from almost all commonly available terminals and personal computers.

  However,  from  an  ergonomic  (human factors) point of view, high resolution
bit-mapped displays (equivalent in resolution  to  the  FAX  standard)  with  a
powerful  personal computer with at least 2 megabytes of memory would be highly
desirable.  Low cost versions (<$1000) of such terminals should be available by
the end of the decade.  It is expected to take at least that  long  to  acquire
and  represent  a  substantial  collection  of books, reports and newspapers in
electronic form.

  Selection. <What books will be on-line in the first year?  Mike Griffith to
approach the Académie française.>

                               Table of Contents
1. SUMMARY
2. PROBLEM
3. BACKGROUND AND NEED
     3.1. Problems
     3.2. Getting Started
     3.3. Technical Issues
     3.4. French Electronic Library
4. CURRENT STATUS
5. PLAN FOR RESEARCH